Image pickup apparatus having auto-focus control and image pickup method

ABSTRACT

To track a region of interest (“ROI”) such as, for example, in a viewfinder of a camera, both the location and distance of a point in the ROI are estimated for a time in the future, based on actual measurements of locations and distances of the point at the present and in the past. Focus is controlled so as to focus on the point of interest based on the estimate of the future location and distance.

FIELD OF THE INVENTION

[0001] The present invention relates to imaging devices having an auto-focusing function and/or to an imaging method.

BACKGROUND OF THE INVENTION

[0002] The relevant prior art for auto-focusing with motion estimation estimates the motion of an object only from distance information. Its estimation has therefore been limited to the motion of objects which move along the optical axis. Japanese Patent Laid-Open No. H07-288732 describes a device which stores the shape of the object of interest and tracks it using the shape as a template. However, that device does not use the distance information of the object of interest, and it does not estimate the future motion of the object of interest. It can therefore track and focus only on very slow objects. In addition, the reference does not mention how to acquire and store the information about the object of interest, and does not describe how to track the object.

SUMMARY OF THE INVENTION

[0003] The present invention is to provide an improved auto-focus device and/or auto-focus method.

[0004] The invention is to provide an auto-focus device and/or auto-focus method having superior speed and precision of object tracking; to provide an auto-focus device and/or auto-focus method for estimating and tracking the motion of a region of interest (ROI) while continuously and automatically focusing on the object; and to provide an auto-focus device and/or auto-focus method having a simple user interface for tracking objects.

[0005] The present invention estimates the position of the ROI and the distance to it using the projected locations of the ROI in the finder and the real distances to it, which have been acquired at past times and at the present time. In addition, the camera system is controlled with the result of the estimation so that the user can keep the ROI in focus continuously.

[0006] The present invention reads out the pixels related to the ROI faster than those of the remaining image.

[0007] The present invention corrects errors of the estimation using the temporal continuity of the distance information.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0008] FIG. 1 shows the first example of the present invention. The camera system is represented conceptually as 101, and the system consists of a lens 102, a half mirror 109, an area image sensor 103, an auto focus sensor 104, a processor 105 which processes the signals from both sensors, and an image processor 106 which processes the signal from the image sensor 103 and yields a final image signal 107. This figure is a conceptual drawing and does not limit the configuration of the system. Processors 105 and 106 are shown as functional blocks, which does not necessarily mean that they are physically arranged in such a manner. 108 is a conceptual drawing of a ray entering the camera system; a part of it goes to the auto focus sensor 104 and the rest goes to the area image sensor 103.

[0009] Processor 105 receives a part of the image corresponding to the ROI and estimates its position in the camera finder at a future time n+1 using the sequence of positions of the ROI in the camera finder at times n, n−1, n−2, . . . . Details will be described later.

[0010] Meanwhile, processor 105 receives a distance to the ROI from the autofocus sensor, and estimates its distance at a future time n+1 using the sequence of past distances at times n, n−1, n−2, . . . . Here, any method can be used to obtain the distance to an ROI located at an arbitrary position in the finder. Examples are the area autofocus technology which is already in production and used by the CANON EOS 1V, and the edge-detecting auto-focus technology used extensively in present digital video cameras.

[0011] The camera system is controlled with the estimated position and distance so that the camera can focus on the ROI at the future time n+1.

[0012] Here, the estimation is done in discrete time, so the approximation holds only around times n and n+1. When there is a long interval between n and n+1, the estimation cannot be exact in the middle of the interval, so that sometimes the camera cannot focus on the ROI properly. For example, if we assume that the area image sensor is read out at the standard 1/30 second per entire frame, the estimation is valid only if the ROI stays within the given depth of field of the camera for 1/30 second. The processor can interpolate the position and distance within one period (here, 1/30 second); however, this is not enough to estimate the position and distance of an ROI moving at higher speed.

[0013] To manage such a case, the image of the ROI should preferably be read out with a high temporal frequency. The frequency depends on the speed of the motion of the ROI. For example, if the image of the ROI is read out 1000 times a second, the camera system can control the focus system of the camera lens 1000 times a second without any interpolation. This frame rate is much higher than the rate at which the lens can respond; therefore the delay of the feedback loop needs to be taken into account to stabilize the system.

[0014] FIG. 7 shows the control flow of the invention. The control consists of the initialization phase 701 and the loop phase 702. First of all, the ROI is specified at 703. It is not possible to track an ROI of arbitrary motion unless the system knows its past positions; therefore, for the very first two frames, the system just obtains the position in the finder and the distance to the object without performing the tracking operation.

[0015] The system acquires the initial position at 704, the initial distance at 705, and the position and the distance corresponding to the next time at 706 and 707, respectively. Having obtained sufficient information for motion estimation through the initialization process, the system makes the first estimation at 708, which is the estimation for the next frame. In the loop phase 702, the camera system is controlled at 709 in accordance with the estimation result. This feedback operation achieves continuous focusing on the ROI. However, the estimation usually has an error relative to the real position and distance. The error is measured at 710. At 711, the position and the distance are estimated again using the present position and distance and the past position and distance.
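By way of illustration only, the initialization phase and the loop phase of FIG. 7 might be sketched in Python as follows. The sketch assumes linear extrapolation from two frames as the estimator; the names `State`, `extrapolate`, `track`, `measure` and `control` are hypothetical and stand in for the sensor read-out and the lens drive, which the description does not specify.

```python
from typing import Callable, List, Tuple

# (x, y) position in the finder plus the distance to the object.
State = Tuple[float, float, float]


def extrapolate(prev: State, curr: State) -> State:
    """Predict the state at n+1 from the states at n-1 and n (linear extrapolation)."""
    return tuple(2.0 * c - p for p, c in zip(prev, curr))


def track(measure: Callable[[], State],
          control: Callable[[State], None],
          frames: int) -> List[float]:
    """Run the initialization phase (701), then the loop phase (702).

    `measure` and `control` are hypothetical stand-ins for the sensor
    read-out and the lens drive. Returns the distance error of each loop frame.
    """
    # Initialization: two frames are measured without any tracking control.
    prev = measure()                      # 704, 705
    curr = measure()                      # 706, 707
    estimate = extrapolate(prev, curr)    # 708: first estimate
    errors = []
    for _ in range(frames):
        control(estimate)                 # 709: drive the camera with the estimate
        actual = measure()
        errors.append(abs(actual[2] - estimate[2]))  # 710: estimation error
        prev, curr = curr, actual
        estimate = extrapolate(prev, curr)           # 711: estimate the next frame
    return errors
```

For an object moving at constant velocity the returned errors are zero; for any other motion they indicate how far the two-frame estimate deviates from the measurement.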

[0016] In this example, the method of specifying the ROI is not limited to one case. For example, a method in which a user specifies the ROI with a pointer and a button is preferred. Alternatively, the system may have an image recognition capability which enables it to specify the ROI automatically by referring to the color, intensity and/or pattern of the image including the ROI.

[0017] This example requires an initialization phase in the control flow. However, initialization is not necessary when the motion of the ROI is empirically obvious, in which case estimation can be started from the very first frame. Also, the estimation can use the position and distance information of only the one preceding frame, or of more than one past frame. The initialization phase needs to be redesigned when plural frames are used.

[0018] The initialization phase is not limited to this example. It is preferred to include exception handling for the case where the user specifies an ROI which is hard or impossible to track.

[0019] The loop phase is not limited to this example. An error recovery test and an abnormal termination function for the case of tracking failure should preferably be added.

[0020] FIG. 2 shows the second example of the present invention. The first example used the area image sensor for tracking the ROI; in this example, however, the autofocus sensor provides both the ROI capturing function and the estimation function. The details of the figure are as follows. The same numbers are attached to the functional blocks which are identical to those in FIG. 1. Auto focus sensor 201 acquires the position of and the distance to the ROI up to time n, and gives the information to processor 105. The processor 105 controls the lens system 102 using this information, so that the ROI is continuously in focus. 202 is a half mirror of the kind normally used in current single lens reflex cameras, which passes a part of the incoming light to the autofocus sensor and the rest of the light to the viewfinder. 203 shows the mechanical shutter unit. This figure does not limit the embodiment of the example.

[0021] Specifically, auto focus sensor 201 has to perform the auto focus operation and the image acquisition at almost the same time. Therefore, the sensor of Japanese Patent Laid-Open No. 2001-250931 can preferably be used in this example. When the system is designed so that only one-dimensional motion (for example, only horizontal or only vertical motion) has to be estimated, the image acquisition of the ROI can be done by a linear auto focus sensor, such as that described in Japanese Patent Laid-Open No. H09-43502. The embodiment of this example is not limited to the application of the above-mentioned sensors. Any autofocus sensor which can obtain the position information in the finder and the distance information is sufficient for the realization of this example.

[0022] Usually the signal used for the auto focus operation has to have a high signal to noise ratio. For example, if the position information is acquired by an auto focus sensor at a frame rate of 1000 frames per second, the short exposure time sometimes makes the signal to noise ratio too low for the autofocus operation, depending on the illumination. In such a case, averaging in the time domain reduces the random noise in the averaged signal and so yields an effectively high signal to noise ratio. This averaging enables the system to focus correctly while estimating the next position even when the illumination is low.
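As a non-limiting sketch of the time-domain averaging described above, successive read-outs of the same sensor elements can be averaged element by element. The function name `temporal_average` and the representation of a read-out as a list of numbers are assumptions made for illustration. For independent zero-mean noise, averaging N read-outs is expected to reduce the noise standard deviation by roughly a factor of the square root of N, at the cost of a lower effective update rate.

```python
from typing import List, Sequence


def temporal_average(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average several read-outs of the same sensor elements, element by element.

    Each entry of `frames` is one read-out; all read-outs must have equal length.
    """
    n = len(frames)
    # zip(*frames) groups the values of one sensor element across all read-outs.
    return [sum(values) / n for values in zip(*frames)]
```

For example, averaging the two read-outs `[1.0, 2.0]` and `[3.0, 4.0]` gives `[2.0, 3.0]`.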

[0023] FIG. 3 shows the third example of the present invention. The same numbers are attached to the blocks which are identical to those in FIG. 1. In the second example, the auto-focus sensor had the estimation function; in this example, the sensor 301 for tracking the ROI is located at the finder block. The incoming light ray reflected at half mirror 202 goes to the sensor 301 after another reflection at the half mirror 303 of penta-prism block 302. The sensor 301 acquires the location information as the image sensor in the first example does, and the lens system 102 is controlled in accordance with the result of the estimation. The sensor 301 can be designed so that it has both the ROI image acquisition function and an auto-exposure sensor function. In this case, the sensor 301 should preferably have a higher image acquisition rate than the area image sensor 103. The low signal to noise ratio of image acquisition for auto-exposure, which sensor 301 needs to take into account, is resolved by the same method as in the second example, i.e., by averaging in the time domain.

[0024] Tracking of the ROI is not always possible. The system sometimes loses the ROI, or tracks the wrong ROI, when the ROI is occluded by another object or when the ROI is deformable (such as eyes). When such cases happen, the errors can be corrected by assuming that the distance does not change drastically. FIG. 4 describes the example.

[0025] FIG. 4 is a conceptual drawing of the distance to the ROI over time. The dotted line in FIG. 4 is an example in which the system switched to focusing on a wrong ROI and the information corresponding to the wrong ROI was picked up at time n. It is not possible to detect at that time whether the tracking is erroneous or not. If the system trusts the wrong position and controls the lens system in accordance with the wrong distance corresponding to the wrong position, the lens system has to be moved significantly, and the significant error puts the actual ROI out of focus. With the present invention, the lens system is not controlled in such a manner. The lens is controlled by referring to the distance information of the time after time n. The following is an example of the lens control of the present invention.

[0026] In the present invention, the distance at time n is not directly fed back to the lens system. Instead, the lens focuses at time n on the distance d1, which is within a reasonable range of a linear interpolation. At time n+1, the system tests the distance to the ROI again.

[0027] If tracking returns to the normal state, as shown by the solid line of FIG. 4, i.e., if the distance to the ROI at time n+1 is within a reasonable range expected from the history, the system finds that the distance information at time n was wrong. The system is then able to continue the tracking and estimation operation after time n+1. On the contrary, if the distance to the ROI at time n+1 is closer to the distance obtained at time n, as shown by the dashed line of FIG. 4, the system knows that the significant change of the distance is a real motion of the object, and controls the lens in accordance with the distance information at time n+1.

[0028] FIG. 8 is an example flow of the present invention. At 801 the position and distance of the ROI at time n+1 are estimated. The lens is controlled at 802 using the estimation. After the control, the difference between the estimation and the real data is calculated. More specifically, the distance to the ROI is measured at 803. The previous distance and the current distance are compared at 804 in order to test whether the distance has changed gradually. If it has changed gradually, the position is acquired at 805. If, however, the distance is found at 804 to have changed significantly, the features of the current image of the ROI and of the previous image of the ROI are stored in a memory at 806, and the distance is estimated and interpolated without using the measured distance of time n+1. Then the position information is sampled at 805; however, since two ROIs have been stored as candidates, the system measures the two distances corresponding to the two ROIs. For each distance, the system tests whether it can be regarded as the result of a reasonable transition. If one of the ROIs has a reasonable distance, that distance is used for the lens system control. If neither of them is in a reasonable range, the current image of the ROI is stored again, and the distance at n+2 is interpolated without the current distance information. The position and distance of the ROI at time n+2 are measured at 811, and the camera system is controlled at 812.

[0029] Here, the definition of continuity is, for example, that the continuity is lost if the absolute value of derivative of the distance in time exceeds a certain limit. The threshold of the limit is a design parameter, and it is also possible that user can specify the value by himself.

[0030] When the continuity is lost for several frames, it suggests that it is impossible to recover the tracking of the ROI; therefore tracking and estimation should be aborted with an error signal. Here, the number of frames allowed to operate without continuity is a design parameter. If the user wants to track a high-speed ROI, the number should be made small.
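The continuity criterion of paragraph [0029] and the abort condition of paragraph [0030] might, for example, be combined as in the following Python sketch. It is a simplification made under stated assumptions: it compares each new distance against the last accepted distance only, and it does not implement the two-candidate handling of FIG. 8. The class name `ContinuityMonitor`, its parameters, and the return values `"ok"`, `"hold"` and `"abort"` are hypothetical.

```python
class ContinuityMonitor:
    """Flags distance samples whose rate of change exceeds a limit."""

    def __init__(self, max_rate: float, max_lost_frames: int) -> None:
        self.max_rate = max_rate                # threshold of paragraph [0029]
        self.max_lost_frames = max_lost_frames  # frame count of paragraph [0030]
        self.trusted = None                     # last distance that was accepted
        self.lost = 0                           # consecutive rejected frames

    def update(self, distance: float, dt: float) -> str:
        """Classify one distance sample taken dt seconds after the previous one."""
        if self.trusted is None:
            self.trusted = distance
            return "ok"
        # Time elapsed since the last accepted sample, counting rejected frames.
        elapsed = dt * (self.lost + 1)
        rate = abs(distance - self.trusted) / elapsed
        if rate <= self.max_rate:
            self.trusted = distance
            self.lost = 0
            return "ok"
        self.lost += 1
        # Continuity lost: hold the lens on an interpolated distance, or give up.
        return "abort" if self.lost >= self.max_lost_frames else "hold"
```

A caller would feed the measured distance to `update` on every frame, drive the lens with the measurement on `"ok"`, use an interpolated distance on `"hold"`, and stop tracking with an error signal on `"abort"`.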

[0031] The continuity of both the distance and the position is tested in FIG. 8. However, it is also possible to test only one of them.

[0032] FIG. 5 shows the fifth example of the present invention. The motions of the features surrounding the ROI can be assumed to have continuity. A tracking error can be recovered using this assumption. Point 501 is the point which represents the ROI, and points 502 to 505 are points around point 501 to be used to recover from the tracking error. For example, if point 501 disappears at time n+1, as shown in FIG. 5 b), or shows peculiar motion, normally the ROI is lost and tracking is no longer possible. With the present invention, as long as points 502 to 505 keep a reasonable shape (in this example, a square-like shape), we can presume that the point 501 is inside the box. We can wait until the ROI recovers by tracking the polygon created by points 502 to 505.

[0033] FIG. 9 is an example of the control flow. The points for error recovery and the point corresponding to the ROI are processed in parallel, and their positions and distances are calculated at 901 and 902. At 903, the point corresponding to the ROI is tested as to whether it is moving reasonably or not. The criterion for reasonability is, for example, that the point is located inside the polygon created by the points for error recovery.
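One possible way to carry out the test at 903 is the standard ray-casting (even-odd) point-in-polygon test, sketched below in Python. The description does not prescribe a particular test, so this choice, the function name `inside_polygon`, and the representation of points as (x, y) tuples are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def inside_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Even-odd test: is `point` inside the polygon given by its vertices in order?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside
```

Here `polygon` would hold the current positions of the points for error recovery (for example, points 502 to 505) and `point` the tracked position of the ROI.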

[0034] If the above test fails, it suggests that the features of the ROI itself are also not correct. In this case the ROI of the previous frame is recalled. Since the tracking was successful at that time, this feature of the ROI is more reliable than that of the current frame. At the next frame, a match with the recalled ROI is searched for in order to recover tracking. If the ROI appears in a reasonable location, it can be said that the error is recovered. Then the system continues to track the ROI and estimate the feature location.

[0035] When the ROI is lost, needless to say, the motion is not reasonable. In that case the estimated position of the point is calculated at 904 and the position is set to the estimated location.

[0036] When the point corresponding to the ROI and some of the points for error recovery show peculiar motion, it suggests that wrong tracking has occurred due to some noise in the input image. In this case, we can recover the tracking and estimation by searching for the ROI around all the points for error recovery, using, for example, template matching or a matched filter, and picking the region which shows the best match to the ROI.

[0037] FIG. 10 is an example of the control flow. The motion of each point for error recovery is tested at 1001. If one of the points shows discontinuity, the system repeats the tracking of the point at 1002. At 1002, a dedicated search for the ROI is done on the image with the assumption that each distance between the ROI and a point for error recovery is nearly constant in a given frame. At 1003, the search result is examined as to whether it is reasonable or not, and the proper point is chosen as the new ROI if it exists.

[0038] Here, the definition of “proper polygon” formed by points for error recovery is just a design parameter. Only square shape can be allowed if strict tracking is required. Deformed trapezoid can be allowed if loose tracking is permitted.

[0039] Here the number of points for error recovery is 4; however, it is not limited to that number, and other numbers, such as 2, 6, or an odd number, are also acceptable. For example, when the ROI is more likely to be lost, we can place many points around the ROI so that we can still track the ROI even if most of the points are lost. As for extrapolating the center of the ROI from the points for error recovery, one can use, for instance, the center of gravity of the points, or the center of gravity of the points after removing the peculiar points which depart from the group.
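The two center-of-gravity techniques mentioned above could be sketched in Python as follows. How a "peculiar" point is identified is not specified in the description; the sketch assumes, purely for illustration, that a point is peculiar when it lies farther than a chosen `max_offset` from the coordinate-wise median of all points. The names `centroid`, `robust_centroid` and `max_offset` are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

Point = Tuple[float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Center of gravity of a set of equally weighted points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def robust_centroid(points: Sequence[Point], max_offset: float) -> Point:
    """Center of gravity after removing points that depart from the group.

    Assumption: a point departs from the group when it is farther than
    max_offset from the coordinate-wise median of all points.
    """
    mx = statistics.median(x for x, _ in points)
    my = statistics.median(y for _, y in points)
    kept = [(x, y) for x, y in points
            if math.hypot(x - mx, y - my) <= max_offset]
    # Fall back to all points if the threshold rejects every one of them.
    return centroid(kept or points)
```

With four recovery points at the corners of a square, both functions return the center of the square; if one corner is mistracked to a distant location, `robust_centroid` ignores it while `centroid` is pulled toward it.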

[0040] As for the way to search using the points for error recovery, we can search for the ROI uniformly around the area where the ROI was in the previous frames. Alternatively, the search can start from the expected area given by the relative distance between the ROI and the points for recovery in the previous frame.

[0041] One example of how to define 1101 is the use of color information. For instance, when a face is the object to be tracked, the center of gravity of the skin-colored region can be chosen as the ROI, and its motion can be tracked and estimated. An object of another color, for example a red ball or an orange fruit, can also be chosen as the object. A point other than the center of gravity of the ROI can also be chosen as the point of interest.
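As an illustrative sketch of the color-based definition, the center of gravity of all pixels matching a color criterion can be computed as below. The image representation (rows of (R, G, B) tuples with 8-bit values), the function names, and the thresholds in `is_red` are assumptions chosen for the red-ball case; a criterion for skin color would have to be supplied separately.

```python
from typing import Callable, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), assumed 8-bit values


def color_centroid(image: Sequence[Sequence[Pixel]],
                   matches: Callable[[Pixel], bool]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) center of gravity of matching pixels, or None if none match."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if matches(pixel):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def is_red(pixel: Pixel) -> bool:
    """Hypothetical criterion for a red object; the thresholds are illustrative only."""
    r, g, b = pixel
    return r > 150 and g < 100 and b < 100
```

Calling `color_centroid(image, is_red)` on each frame gives a point whose motion can then be tracked and estimated.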

[0042] Another example of how to define the ROI is to use the intensity pattern of the ROI. For this purpose, the Lucas-Kanade method, template matching, or Szeliski's spline-based method can be used. Needless to say, the algorithm is not limited to these examples.

[0043] Here is an example of a method to estimate the location of the ROI using previous locations. We can use linear extrapolation of the point from the previous and current locations. It is also possible to estimate the position from the speed or the acceleration of the previous and current motion. Another example is to use a Kalman filter, extended Kalman filter or unscented Kalman filter, which yields an optimum estimation.
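The first two estimators mentioned above can be written for a single coordinate as in the following Python sketch, which assumes equally spaced frames and measures speed and acceleration in units per frame. The function names are hypothetical, and the Kalman-filter variants are not shown.

```python
def predict_constant_velocity(x_prev: float, x_curr: float) -> float:
    """Linear extrapolation: x[n+1] = x[n] + (x[n] - x[n-1])."""
    return 2.0 * x_curr - x_prev


def predict_constant_acceleration(x_prev2: float, x_prev: float, x_curr: float) -> float:
    """Extrapolation using speed and acceleration estimated from three samples."""
    velocity = x_curr - x_prev                           # latest speed per frame
    acceleration = (x_curr - x_prev) - (x_prev - x_prev2)  # change of speed per frame
    return x_curr + velocity + acceleration
```

The same functions would be applied independently to the horizontal position, the vertical position, and the distance. The second form needs three frames of history instead of two, which is one reason the initialization phase depends on the estimator chosen.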

[0044] FIG. 9 is another example of the present invention. The ROI has been regarded as an ideal “point” in the above; however, it is rarely a true point. When the image of the ROI is captured by a high-resolution area image sensor, it is hard to track only one point corresponding to a single pixel. Normally the point is selected as a region which consists of several pixels. The size of the region depends on the size and the texture of the object of interest. For example, if the object of interest is located close to the camera and has a smooth, gradation-like texture, the region has to be large. If the object is located far from the camera, or has a lot of high spatial frequency components, the region should be set small. However, it is really hard for the user to control the region in real time.

[0045] The present invention assists the user with an automatic region selection method. FIG. 6 explains it in detail. FIG. 6(a) is the case when the object is relatively large in the finder. When the user selects the point of interest 601, the region grows from the point as a center until the magnitude of the spatial difference in the X-direction and the Y-direction becomes sufficiently large for tracking. In FIG. 6(b), the region specified by dotted line 602 does not have enough magnitude to track, and therefore needs to be enlarged. In FIG. 6(c), the region 603 now has enough magnitude for tracking, namely, it contains information about the edge of the object to be tracked, so further enlargement of the region is stopped. The region can either be updated on each frame or stay the same size across frames. The region can be expanded in the X-direction only, in the Y-direction only, or in both. The magnitude of the spatial difference in the X-direction and the Y-direction can be measured by the absolute sum or the squared sum of the spatial differences, and it is compared with a certain criterion. This function can be assigned to a new button on the camera, or to a partial depression of the shutter button.
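The region-growing procedure of FIG. 6 might be sketched in Python as follows. The sketch assumes a grayscale image stored as rows of numbers, grows a square window symmetrically in both directions, and uses the absolute sum of neighboring-pixel differences as the magnitude; the function names, the `threshold` parameter and the `max_half` limit are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def gradient_energy(image: Sequence[Sequence[float]],
                    x0: int, y0: int, x1: int, y1: int) -> Tuple[float, float]:
    """Absolute sums of X and Y differences inside the window (bounds inclusive)."""
    gx = gy = 0.0
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if x + 1 <= x1:
                gx += abs(image[y][x + 1] - image[y][x])
            if y + 1 <= y1:
                gy += abs(image[y + 1][x] - image[y][x])
    return gx, gy


def grow_region(image: Sequence[Sequence[float]], cx: int, cy: int,
                threshold: float,
                max_half: Optional[int] = None) -> Optional[Tuple[int, int, int, int]]:
    """Grow a window around (cx, cy) until both difference sums reach threshold.

    Returns (x0, y0, x1, y1), or None if no window up to the limit qualifies.
    """
    height, width = len(image), len(image[0])
    limit = max(width, height) if max_half is None else max_half
    for half in range(1, limit + 1):
        # Clamp the window to the image borders.
        x0, x1 = max(0, cx - half), min(width - 1, cx + half)
        y0, y1 = max(0, cy - half), min(height - 1, cy + half)
        gx, gy = gradient_energy(image, x0, y0, x1, y1)
        if gx >= threshold and gy >= threshold:
            return (x0, y0, x1, y1)  # enough edge information: stop enlarging
    return None
```

When the selected point lies in the middle of a uniformly bright object, the window keeps growing until it reaches the edges of the object, which corresponds to the transition from FIG. 6(b) to FIG. 6(c).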

[0046] Here, the point 601 can be moved and placed at any location on the finder by user operation. It is also preferable to display both the point and the image acquired by the image sensor in a preview window, so that the user can specify the point of interest using the preview monitor.

What is claimed is:
 1. An image pickup apparatus comprising: tracking means for automatically tracking a predetermined point of interest; estimating means for, based on each information on locations and distances of the point of interest at present, n, and in past, n−1, estimating a location and a distance of the point of interest in future, n+1; and controlling means for controlling to focus on the point of interest based on an output of said estimating means.
 2. The apparatus according to claim 1, wherein said controlling means controls to focus on the point of interest by using a signal from at least a part of an area image sensor.
 3. The apparatus according to claim 1, wherein said controlling means controls to focus on the point of interest by using a signal from at least a part of a focus sensor.
 4. The apparatus according to claim 1, wherein said controlling means controls to focus on the point of interest by using a signal from at least a part of a light metering sensor.
 5. The apparatus according to claim 2, wherein said controlling means discards signals of a region which is not used for the tracking and for reading out at high speed a region to be tracked for the point of interest.
 6. The apparatus according to claim 5, wherein said control means discards the signal of non-neighboring of the point of interest.
 7. The apparatus according to claim 1, further comprising correcting means for correcting errors based on temporal continuity of the distance information.
 8. The apparatus according to claim 1, further comprising correcting means for correcting errors based on spatial continuity of motion of the point.
 9. The apparatus according to claim 1, wherein said controlling means controls by using object color information as tracking information.
 10. The apparatus according to claim 1, wherein said controlling means controls by using object illumination information.
 11. The apparatus according to claim 1, wherein said controlling means detects motion of the point of interest using a time differential component and a space differential component of illumination information around the point of interest.
 12. The apparatus according to claim 1, wherein said controlling means estimates the motion of the point of interest by using template matching.
 13. The apparatus according to claim 1, wherein said controlling means estimates the motion of the point of interest by using Kalman Filter.
 14. The apparatus according to claim 1, further comprising specifying means for specifying the point of interest on a finder or a preview monitor; and calculating means for calculating to select a region of interest based on the specified point of interest.
 15. A method for an image pickup, comprising the steps of: automatically tracking a point of interest; based on each information on locations and distances of the point of interest at present, n, and in past, n−1, estimating both a location and a distance of the point of interest in future, n+1; and controlling to focus on the point of interest based on the estimated location and distance of the point.
 16. The method according to claim 15, wherein a signal from at least a part of an area image sensor is used for controlling.
 17. The method according to claim 15, further comprising the step of correcting errors based on temporal continuity of the distance information of the point of interest.
 18. The method according to claim 15, further comprising the step of correcting errors based on spatial continuity of motion of the point of interest.
 19. The method according to claim 15, further comprising the steps of: specifying the point of interest on a finder or a preview monitor; and calculating to select a region of interest based on the specified point of interest.
 20. A computer readable medium having recorded thereon a computer program implementing the steps defined in claim 15.